An Automatic Identification of Lung Cancer from different types of Medical Images
Gayathri K1*, Vaidhehi V2
Department of Computer Science, Christ (Deemed to be University), Hosur Road, Bangalore
*Corresponding Author E-mail: gayuknaidu@gmail.com
ABSTRACT:
Identifying lung cancer from medical images is a difficult task. The objective of this research work is to classify a lung as cancerous or non-cancerous from different types of medical images, namely Computed Tomography (CT) and Positron Emission Tomography (PET) images. The proposed algorithm predicts lung cancer using image processing techniques and is divided into four stages: pre-processing, binarization, segmentation and thresholding. The approach aims to retain image quality so that appropriate features can be extracted for identifying cancerous and non-cancerous lungs. The algorithm is trained and tested on cancerous and non-cancerous images.
KEYWORDS: Lung cancer, pre-processing, binarization, segmentation, thresholding.
INTRODUCTION:
Medical images can be interpreted only by medical experts such as doctors and lab technicians. A medical image can be represented as a set of numbers that are stored and handled by digital image processing. To design an automatic identification system, medical images are processed using digital image processing techniques. These techniques are used in various medical fields for diagnosing and predicting different types of cancer, such as lung, liver and breast cancer.
Different types of medical images are obtained using Computed Tomography (CT) and Positron Emission Tomography (PET). A CT image gives a doctor a very clear picture of a tumor and its position. Generally, CT images provide an outline of the structure of the organ, and an abnormality of the organ is detected when structural changes are observed. When an abnormality is detected, further investigation such as a biopsy is recommended by the doctor, because finer internal details of the organ are not captured in CT images. CT scanning also exposes the patient to a high radiation dose. PET images give a doctor information about bodily functions through biochemical processes. Using these images, internal abnormalities can be detected, and a tumor may be detected before the patient expresses any symptoms. When an abnormality of the organ is detected using PET images, different medications can be given depending upon the level of abnormality. Although the radioactive components of the PET scanning procedure do not last for a long period, the exposure itself can harm the human body. Generally, doctors prescribe different medical imaging procedures to patients based on their clinical reports.
As different types of medical images such as CT and PET are available for identifying cancer in different organs, this research work is designed to automatically identify lung cancer using CT and PET images. The uncontrolled growth of cells in the lung leads to lung cancer. Primary lung cancer is cancer that starts in the lung and has symptoms such as coughing, chest pain and weight loss. Automatic detection of lung cancer can be designed using image processing techniques. The important requirements in processing the image are obtaining pixel intensities by converting the image into digital form, segmenting the appropriate region of the image, performing mathematical operations on the pixels, and recreating the image with good quality.
Detecting cancer at an early stage is difficult. The proposed algorithm includes a number of stages to detect cancerous and non-cancerous lungs. The input image is pre-processed and then binarized. The binarized image is segmented into the left and right lung. After segmentation, the pixel values are extracted for thresholding. Based upon the threshold value, the lung is classified as cancerous or not. The proposed algorithm was trained on 48 CT images and 22 PET images taken from the Cancer Imaging Archive (cancerimagingarchive). It was tested on 9 PET and 8 CT images, and the accuracy was found to be 82.35%.
This paper is structured as follows: Section 2 summarizes the existing research in this domain. Section 3 elaborates on the methodology followed in this research work. Section 4 explains the results. Section 5 presents the concluding remarks of the paper.
RELATED WORKS:
Most cancers that start in the lungs, known as primary lung cancers, are carcinomas. The different types of lung cancer are discussed in [1]. Samuel H. Hawkins et al. [2] predicted the different types of cancer using classifiers. Joseph A. et al. [3] evaluated the performance of different machine learning algorithms in lung cancer prediction. Sundararajan et al. [4] proposed a support vector machine for the detection of pneumoconiosis based on various textural features of disjoint segments of the lung. The number of people suffering from lung cancer increases every year [5].
Samuel Cheng et al. [6] proposed an algorithm for malignant nodule detection using the watershed segmentation method for CT images. Disha and Gagandeep [7] proposed a CAD system using an image slicing algorithm in which a Wiener filter was used to remove the noise content. Various image enhancement techniques such as the Gabor filter, an auto-enhancement algorithm and the FFT are used in [8], where segmentation of the enhanced image is done by a thresholding approach and a watershed segmentation approach.
Ankit Agrawal [9] designed a lung cancer outcome calculator. Iwano et al. [10] designed a CAD system to automatically classify nodules based on shape features. The authors of [11] proposed a new segmentation method based on a Bayesian framework for CT images. Neha et al. [12] present two segmentation methods, a Hopfield neural network and a fuzzy c-means clustering algorithm, for segmenting sputum color images to detect lung cancer in its early stages. Lung cancer detection using artificial neural networks and fuzzy clustering methods is discussed in [13].
The use of radiation therapy for lung cancer is discussed in [14]. Davis et al. used a gray-level co-occurrence matrix to generate features based on a pixel's neighborhood [15]. The segmentation of complex images using multilevel thresholding is implemented in [16]. Computer-aided diagnosis of glaucoma using digital fundus images is elaborated in [17]. A client-server based system for maintaining X-ray images with patient information is implemented in [18].
An automatic method based on the subtraction of two serial mass chest radiographs to detect new lung nodules is discussed in [19]. Marshal Tariq et al. [20] proposed a morphological reconstruction based segmentation technique with a special focus on early tumor detection. The use of the expectation-maximization algorithm in the segmentation process, to find the locally maximum likelihood parameters of a statistical model of the images, is explained in [21]. Multiscale decomposition based texture analysis and spatial clustering are used for better classification and segmentation of tissues [22]. Gabor filter based and quaternionic Gabor filter based approaches to color texture segmentation are explained in [23].
It is clear from the literature that image processing techniques are widely used in understanding medical images. Many researchers have studied different pre-processing techniques and their significance for the performance of the prediction system. The performance of various machine learning algorithms for predicting lung cancer from medical images has also been explored. In the literature, individual algorithms predominantly use either CT images or PET images to identify lung cancer. However, doctors prescribe different screening procedures for different patients. Therefore, a generalized algorithm using both PET and CT images is required to identify lung cancer. This paper proposes an algorithm that works with different image sources; it is elaborated in the next section.
METHODOLOGY:
The proposed algorithm accepts different input sources, namely CT and PET images. After the image is taken from the source, it is pre-processed and then binarized. The image is then segmented into the left and right lung. After segmentation, the pixel values are identified for thresholding. Based on the threshold value, the lung is classified as cancerous or non-cancerous. The overview of the research work is shown in figure 1. The detailed architecture of the research work is shown in figure 2.
Fig 1: Lung Cancer Detection Block Diagram
Fig 2: Architecture of Preventing Lung Cancer
Image Dataset:
The dataset consists of 50 CT and 50 PET images of patients taken from the Cancer Imaging Archive (cancerimagingarchive). These images are classified as cancerous and non-cancerous images of a lung. The images are further classified per lung as completely affected, not affected or partially affected. Out of the 100 images, 30 could not be processed. The remaining 70 images are distributed as follows: 3 images are identified as right lung completely affected, 8 as left lung completely affected, 13 as right lung not affected, 12 as left lung not affected, 17 as right lung partially affected and 17 as left lung partially affected. The images are in JPG file format. These 70 images are considered for pre-processing. Sample CT images of cancerous and non-cancerous lungs taken from the dataset are shown in figure 3. A single CT image of a non-cancerous lung is shown in figure 4. Sample PET images of cancerous and non-cancerous lungs taken from the dataset are shown in figure 5. A single PET image of a cancerous lung is shown in figure 6.
Fig 3: Sample CT Image
Fig 4: Single CT Image
Fig 5: Sample PET Image taken from Dataset
Fig 6: Single PET Image
Pre-processing:
The selected images from the datasets are pre-processed as follows:
1) Conversion of images into Gray Scale:
Grayscale conversion reduces the complexity of each pixel from three color channel values to a single intensity value. A gray image is obtained from a color image by discarding all hue and saturation information.
2) Normalization:
Normalization changes the range of intensity values of an image that has poor contrast. The input image is normalized and resized. The optimized size of the normalized image is found to be 400 × 300 pixels; it is found that this resized image retains the complete information.
3) Removing the Noise:
Noise removal is used to enhance the clarity of the image. After normalization, noise is removed. Noise such as salt-and-pepper noise is removed using a median filter.
4) Binary Image:
In order to find a region of interest in the image, the image is converted to binary after noise removal. The binary image has pixel values of 0 (black) and 1 (white).
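As an illustration of the four pre-processing steps above, the following is a minimal sketch in plain Python rather than the MATLAB implementation used in this work. The function names, the luminance weights, the min-max stretch, the 3 × 3 window and the fixed binarization level of 128 are assumptions chosen for the example, and resizing is omitted.

```python
from statistics import median

def to_gray(rgb_image):
    """Collapse each (R, G, B) pixel to one intensity (assumed luminance weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]

def normalize(gray, new_min=0.0, new_max=255.0):
    """Min-max contrast stretch so intensities span [new_min, new_max]."""
    flat = [p for row in gray for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [[new_min for _ in row] for row in gray]
    scale = (new_max - new_min) / (hi - lo)
    return [[new_min + (p - lo) * scale for p in row] for row in gray]

def median_filter(gray):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(median(window))
        out.append(row)
    return out

def binarize(gray, level=128):
    """Map bright pixels (>= level) to 1 and dark pixels to 0; level is an assumption."""
    return [[1 if p >= level else 0 for p in row] for row in gray]

def preprocess(rgb_image, level=128):
    """Grayscale -> normalize -> median filter -> binarize."""
    return binarize(median_filter(normalize(to_gray(rgb_image))), level)
```

In this sketch a single bright pixel surrounded by dark pixels is treated as salt noise and removed by the median filter before binarization, while a larger bright region survives.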
Binarizing:
Image binarization changes the pre-processed input image into a black and white image. Binarization is a pre-processing technique in which only black and white are used for the resulting image. Such images are also termed bi-level images, which means that every pixel has a value of either 1 or 0.
Image Segmenting:
Segmentation is the method of breaking down an input image into different segments. An image is segmented to find objects or related details in it. The purpose of segmentation is to represent the image in a meaningful way so that it can be analyzed. Using an existing segmentation algorithm, the watershed algorithm [6] [8], the images are segmented into the right and left lung for further processing.
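The watershed algorithm itself is not reproduced here. As a simpler stand-in, the sketch below (plain Python, hypothetical function names) separates a binarized image into two masks by labeling 4-connected foreground regions and assigning each region to the image-left or image-right side according to its centroid. It assumes that foreground pixels have the value 1 and that the two lung regions do not touch.

```python
from collections import deque

def label_components(binary):
    """Return a list of components, each a list of (row, col) foreground pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:  # breadth-first flood fill over 4-connected neighbours
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components

def split_left_right(binary):
    """Build two masks: regions centred left of the image midline, and the rest."""
    h, w = len(binary), len(binary[0])
    left = [[0] * w for _ in range(h)]
    right = [[0] * w for _ in range(h)]
    for pixels in label_components(binary):
        centroid_col = sum(x for _, x in pixels) / len(pixels)
        target = left if centroid_col < w / 2 else right
        for y, x in pixels:
            target[y][x] = 1
    return left, right
```

Each returned mask can then be passed to the pixel-counting step independently.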
Thresholding Method:
Thresholding is the simplest method of segmentation. Binary images are obtained by thresholding the grayscale image. Thresholding replaces every pixel in an image with a black pixel if the image intensity $I_{i,j}$ is less than some fixed constant $T$ (that is, $I_{i,j} < T$), or with a white pixel if the image intensity is greater than that constant. The proposed algorithm uses three threshold values: thresh1, thresh2 and thresh3. If the white-pixel percentage of the CT or PET input image is more than thresh1, the image is classified as cancerous. If the white-pixel percentage is not more than thresh2, the image is classified as non-cancerous. Otherwise, if the white-pixel percentage is less than thresh3, the image is classified as partially cancerous.
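The three-threshold decision described above can be sketched as follows in plain Python. The function names and the numeric threshold values in the usage lines are placeholders chosen for illustration, not the values determined in this work; pixels of value 1 are assumed to be white, and the final "undetermined" branch is an addition that covers inputs the description leaves open.

```python
def white_pixel_percent(binary):
    """Percentage of pixels in a 0/1 image that are 1 (assumed to be white)."""
    total = sum(len(row) for row in binary)
    white = sum(sum(row) for row in binary)
    return 100.0 * white / total

def classify(percent, thresh1, thresh2, thresh3):
    """Apply the three thresholds in the order given in the text."""
    if percent > thresh1:
        return "cancerous"
    if percent <= thresh2:
        return "non-cancerous"
    if percent < thresh3:
        return "partially cancerous"
    return "undetermined"  # not covered by the description; added for completeness

# Usage with placeholder thresholds (hypothetical values).
image = [[1, 0], [1, 1]]
print(white_pixel_percent(image))                          # 75.0
print(classify(white_pixel_percent(image), 60, 40, 60))    # cancerous
```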
PROPOSED ALGORITHM:
MATLAB is used to implement the lung cancer detection. MATLAB is a programming language and environment for visualization and application development, and it is widely used for image processing. The algorithm for implementing the above-mentioned procedure is as follows.
Algorithm: Lung Cancer Identification ()
Let n be the number of images; let wp[] store the number of white pixels in each image; let bp[] store the number of black pixels in each image; let tw store the total number of white pixels; let tb store the total number of black pixels; let aw store the average number of white pixels; let ab store the average number of black pixels. The thresholds are thresh1, thresh2 and thresh3. Initialize the variables to zero.
BEGIN
// Pre-processing
Step 1: For each image do the following
    Step 1.1: Input the image (CT or PET)
    Step 1.2: If the input image is a color image then
        Convert the image to grayscale
    Step 1.3: Normalize the image
    Step 1.4: Identify the optimized image size
    Step 1.5: If the image has noise then use filters to reduce the noise
    Step 1.6: Binarize the image so that the pixel values can be found
End for
// Binarization
Step 2: For each pre-processed image do the following
    Step 2.1: Find the total white pixel value (tw)
    Step 2.2: Find the total black pixel value (tb)
End for
Step 3: Find the averages of tw and tb and store them in aw and ab
Step 4: Obtain the threshold value thresh1
Step 5: If the tw value is greater than thresh1 then
    Cancerous lung
// Segmentation
Step 6: For every binarized image do the following
    Step 6.1: Use the watershed algorithm to segment the image
    Step 6.2: Store the segmented images separately
End for
Step 6.3: For each segmented right lung image do the following
    Step 6.3.1: Find the total pixel count, tw and tb for the image
End for
Step 6.4: Find the averages of tw and tb and store them in aw and ab
Step 6.5: Obtain the threshold from the averages
Step 6.6: If the true pixel value is more than thresh2 then
        Cancerous right lung
    else if the false pixel value is more than thresh2 then
        Non-cancerous right lung
    else if the false pixel value is less than thresh3 then
        Right lung is partially affected
    End if
Step 6.7: For each segmented left lung image do the following
    Step 6.7.1: Find the total pixel count, tw and tb for the image
End for
Step 6.8: Find the averages of tw and tb and store them in aw and ab
Step 6.9: Obtain the threshold from the averages
Step 6.10: If the true pixel value is more than thresh3 then
        Cancerous left lung
    else if the false pixel value is more than thresh3 then
        Non-cancerous left lung
    else if the false pixel value is less than thresh3 then
        Left lung is partially affected
    End if
END
Thresholding is a process of segmenting the image. Once the input image is binarized, the threshold value is calculated, and the lung is then classified as cancerous or non-cancerous based on it.
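Steps 2 and 3 of the algorithm can be sketched as follows, in plain Python rather than MATLAB. The function names are hypothetical, pixels of value 1 are assumed to be white, and because the rule that turns the averages into a threshold is not detailed, the sketch stops at aw and ab.

```python
def pixel_counts(binary):
    """Return (white, black) pixel counts of a 0/1 image."""
    total = sum(len(row) for row in binary)
    white = sum(sum(row) for row in binary)
    return white, total - white

def average_counts(binary_images):
    """Accumulate per-image counts (wp, bp), totals (tw, tb) and averages (aw, ab)."""
    wp, bp = [], []
    for image in binary_images:
        white, black = pixel_counts(image)
        wp.append(white)
        bp.append(black)
    tw, tb = sum(wp), sum(bp)
    n = len(binary_images)
    return tw / n, tb / n  # aw, ab

# Two tiny binarized images used only as an example.
images = [[[1, 1], [0, 0]],
          [[1, 0], [0, 0]]]
aw, ab = average_counts(images)
print(aw, ab)  # 1.5 2.5
```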
RESULTS AND DISCUSSION:
Figure 7 shows a sample CT lung image taken from the dataset. The encircled portion of the image indicates the cancerous region of the lung. The results of the proposed algorithm for the sample image are shown in figures 8 to 12. Figure 8 shows the pre-processing step for the sample CT image, which includes removing the noise content and resizing the image to a particular size. Figure 9 shows the binarization of the pre-processed CT image, which converts the gray image into an image containing only black and white pixels for further processing. Figures 10 and 11 show the segmentation of the binarized CT image, which uses the watershed algorithm to segment the lung into right and left. Figure 12 shows the thresholding of the left segmented image: the true pixel count (112833) is greater than the false pixel count (85755), which indicates that the left lung is completely affected.
Fig 7: Sample CT Image of Lung Cancer
Fig 8: Pre-processed CT Lung Image
Fig 9: Binarized CT Lung Image
Fig 10: Segmented Left lung CT Image
Fig 11: Segmented Right lung CT Image
Fig 12: Completely Affected CT Image
The proposed algorithm is evaluated by using 53 images for training, during which the threshold values are determined, and 17 images for testing. The images are selected by random sampling from the dataset of 70 images. Out of the 17 test images, 14 are identified correctly and 3 are not. The accuracy is calculated to be 82.35%. A summary of the results is given in Table 1. The point at which the false acceptance rate and the false rejection rate intersect gives the equal error rate, which is calculated to be 0.0588. The false acceptance and false rejection rates are summarized in Table 2.
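The split and the accuracy figure can be reproduced with a short sketch in plain Python. The function names and the random seed are hypothetical, and the image indices stand in for the actual files.

```python
import random

def split_dataset(n_images=70, n_test=17, seed=0):
    """Randomly pick n_test image indices for testing; the rest are for training."""
    rng = random.Random(seed)  # seed is a hypothetical choice for repeatability
    test = sorted(rng.sample(range(n_images), n_test))
    test_set = set(test)
    train = [i for i in range(n_images) if i not in test_set]
    return train, test

def accuracy_percent(n_correct, n_tested):
    """Share of correctly identified test images, as a percentage."""
    return 100.0 * n_correct / n_tested

train, test = split_dataset()
print(len(train), len(test))                # 53 17
print(round(accuracy_percent(14, 17), 2))   # 82.35
```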
Table 1: Results
Image | Height | Width | Total Pixel | True Pixel | False Pixel | Result | Tested Result | Falsely Rejected | Falsely Accepted
PET | 289 | 214 | 61846 | 37429 | 24417 | Left Lung Completely Affect | Yes | No | No
CT | 309 | 194 | 59946 | 46657 | 13289 | Right Lung Completely Affect | Yes | No | Yes
CT | 321 | 218 | 73978 | 55318 | 18660 | Left Lung Completely Affect | No | Yes | No
CT | 309 | 194 | 69956 | 48647 | 21309 | Right Lung Completely Affect | Yes | No | No
PET | 309 | 194 | 62956 | 49669 | 13289 | Right Lung Completely Affect | Yes | No | Yes
PET | 289 | 214 | 65826 | 39439 | 26387 | Left Lung Completely Affect | Yes | No | No
CT | 351 | 269 | 94419 | 36530 | 57889 | Left Lung Not Affect | Yes | No | No
PET | 292 | 273 | 82716 | 23370 | 59346 | Left Lung Not Affect | Yes | No | Yes
CT | 641 | 333 | 213453 | 91894 | 121559 | Right Lung Not Affect | Yes | No | No
PET | 189 | 162 | 32618 | 10497 | 22121 | Right Lung Not Affect | No | Yes | No
PET | 156 | 106 | 16536 | 7948 | 8588 | Right Lung Not Affect | Yes | No | No
CT | 145 | 108 | 15660 | 4132 | 11528 | Left Lung Partial Affect | Yes | No | No
PET | 99 | 79 | 7821 | 5635 | 2186 | Left Lung Partial Affect | Yes | No | Yes
PET | 181 | 183 | 33123 | 24382 | 8741 | Right Lung Partial Affect | No | Yes | No
CT | 255 | 165 | 42075 | 22429 | 19646 | Right Lung Partial Affect | Yes | No | No
CT | 391 | 331 | 129421 | 51457 | 77964 | Left Lung Partial Affect | Yes | No | No
PET | 239 | 211 | 50429 | 31583 | 18846 | Right Lung Partial Affect | Yes | No | Yes
Table 2: Acceptance and Rejection Rate
Threshold | FAR (Number of FAR /Total Acceptance) | FRR (Number of FRR/Total Acceptance)
Thresh1 | 0.1176 | 0.0588
Thresh2 | 0.0588 | 0.0588
Thresh3 | 0.0588 | 0.1176
Table 1 shows the results for the 17 images used for testing. Out of the 17 images, 14 are identified correctly, marked Yes in the Tested Result column, and 3 are not identified correctly, marked No. The table includes the height, width, total pixel count, true pixel count and false pixel count of each image, which are the parameters used to test the algorithm.
Table 2 shows the false acceptance rate (FAR) and false rejection rate (FRR) for the three threshold values, derived from the test images. The false acceptance rate measures how often an image is falsely accepted; it is calculated as the number of falsely accepted cases divided by the total number of acceptances. The false rejection rate measures how often non-cancerous tissue is identified as cancerous; it is calculated as the number of falsely rejected cases divided by the total number of acceptances. The table lists the threshold, FAR and FRR.
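The two distinct values in Table 2 coincide with 1/17 and 2/17 rounded to four decimal places. The sketch below therefore assumes, as an interpretation rather than something stated in this work, that each rate is a count of misclassified images divided by the 17 test images. It also takes the threshold at which FAR and FRR are closest as the equal-error point; the function names are hypothetical.

```python
def rate(n_cases, n_total):
    """Error rate rounded to four decimal places, as in Table 2."""
    return round(n_cases / n_total, 4)

print(rate(1, 17), rate(2, 17))  # 0.0588 0.1176

# (FAR, FRR) pairs copied from Table 2.
table2 = {
    "Thresh1": (0.1176, 0.0588),
    "Thresh2": (0.0588, 0.0588),
    "Thresh3": (0.0588, 0.1176),
}

def equal_error_point(rates):
    """Return the threshold where FAR and FRR are closest, and their mean there."""
    name = min(rates, key=lambda k: abs(rates[k][0] - rates[k][1]))
    far, frr = rates[name]
    return name, (far + frr) / 2

print(equal_error_point(table2))  # ('Thresh2', 0.0588)
```

Under this reading, FAR and FRR coincide at Thresh2, which matches the reported equal error rate of 0.0588.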
CONCLUSION:
In this paper, the identification of lung cancer is implemented using various image processing methods. The proposed algorithm uses both CT and PET lung images for the identification of lung cancer and includes the steps of pre-processing, binarization, segmentation and thresholding. The dataset used in this research work is imbalanced, as the images are not uniformly distributed across the different categories. Also, the number of images used for training can be increased so that the threshold values can be calculated more accurately. Thus, the performance of the proposed algorithm can be improved by increasing the number of images in the dataset and by using a balanced dataset.
Various methods of processing medical images can be explored to improve the performance of the proposed algorithm. The proposed algorithm can be extended to detect cancer in other organs such as the liver and breast, and its performance can be evaluated. The automatic resizing mechanism for images can be improved for more efficient detection of lung cancer. The performance of the proposed algorithm can also be measured with different segmentation methods. The proposed algorithm can be enhanced to handle MRI images.
REFERENCES:
1. Prathamesh Gawade and R.P. Chauhan, “Detection of lung cancer using image processing techniques,” International Journal of Advanced Technology and Engineering Exploration, Volume (3), Issue (1), 2016.
2. Munimanda Prem Chander, M. Venkateshwara Rao, T. V. Rajinikanth, “Detection of Lung Cancer Using Digital Image Processing Techniques: A Comparative Study,” International Journal of Medical Imaging, Volume (5), Issue (4), 2017.
3. B. Muthazhagan and T. Ravi, “An Early Diagnosis of Lung Cancer Disease Using Data Mining and Medical Image Processing Methods: A Survey,” Middle-East Journal of Scientific Research, Volume (5), Issue (4), 2016.
4. Arvind Kumar Tiwari, “Prediction of Lung Cancer Using Image Processing Techniques: A Review,” An International Journal (ACII), Volume (3), Issue (1), 2016.
5. Mokhled S.AL-Tarawneh, “Lung Cancer Detection using Image Processing Technique,” Leonardo Electronic Journal of Practices and Technologies, Issue (20), 2012.
6. Prashant Naresh and Dr. Rajashree Shettar, “Image Processing and Classification Techniques for Early Detection of Lung Cancer for Preventive Health Care: A Survey,” Int. J. of Recent Trends in Engineering & Technology, Volume (11), 2014.
7. Jaspinder Kaur, Nidhi Garg, Daljeet Kaur, “A survey of Lung Cancer Detection Techniques on CT scan Images,” International Journal of Scientific and Engineering Research, Volume (5), Issue (6), 2014.
8. Er. Nisha, Er. Lavina Maheshwari, “Lung Tumor Detection by Using Image Segmentation and Neural Network,” International Journal of Enhanced Research in Science, Technology and Engineering, Volume (4), Issue (12), 2015.
9. G. Vijaya, A. Suhasini, R. Priya, “,” An International Journal (ACII), Volume (3), Issue (1), 2016.
10. G. Niranjana, Dr. M. Ponnavaikko, “A Review on Image Processing Methods in Detecting Lung Cancer using CT Images,” International Conference on Technical Advancements in Computers and Communications, 2017.
11. Shraddha G. Kulkarni, Sahebrao B. Bagal, “Lung Cancer Tumor Detection Using Image Processing and Soft Computing Techniques,” International Conference on Recent Research Development in Science, Engineering and Management, 2016.
12. Neha Panpaliya, Neha Tadas, Surabhi Bobade, Rewti Aglawe, Akshay Gudadhe, “A Survey on Early Detection and Prediction of Lung Cancer,” An International Journal of Computer Science and Mobile Computing, Volume (4), Issue (1), 2015.
13. Md. Badrul Alam Miah and Mohammad Abu Yousuf, “Detection of Lung Cancer form CT Image Using Image Processing and Neural Network,” International Conference on Electrical Engineering and Information Communication Technology, 2015.
14. Mukesh K. Nag, Satish Patel, Rajnikant Panik, Shikha Shrivastava, Sanjay J. Daharwal, Manju R. Singh, Deependra Singh, “Lung Cancer Targeting: A Review,” Research Journal of Pharmacy and Technology, Volume (6), Issue (11), 2013.
15. R. Pandian, Dr. Lalitha Kumari, “C T Image for Lung Cancer Identification,” Research Journal of Pharmacy and Technology, Volume (9), Issue (12), 2016.
16. A.V.S.N. Murty, B.N. Jagadesh, K. Bhagavan, S. Satyanarayana, “A Comparative Study of Various Edge Enhancement Filters in Spatial Domain,” Research Journal of Pharmacy and Technology, Volume (9), Issue (12), 2016.
17. S. Syes Abdul Syed, T. Senthil Kumaran, “FCM based Segmentation for Medical Images,” Research Journal of Pharmacy and Technology, Volume (10), Issue (12), 2017.
18. Shaik Naseera, “Client-Server Architecture for Embedding Patient Information on X-Ray Images,” Research Journal of Pharmacy and Technology, Volume (9), Issue (9), 2016.
19. Deepak Rao Khadatkar and Yogesh Rathore, “An Efficient and Useful Hybrid Approach for Detection of Lung Cancer,” Research Journal of Pharmacy and Technology, Volume (2), Issue (4), 2011.
20. T. Sudhakar, Bethanney Janney. J, Haritha. D, Juliet Sahaya. M, Parvathy. V, “Automatic Detection and Classification of Brain Tumor using Image Processing Techniques,” Research Journal of Pharmacy and Technology, Volume (10), Issue (11), 2017.
21. Swarnakala, Natarajah Srikumaran, “Brain Tumor Segmentation by EM Algorithm,” Research Journal of Pharmacy and Technology, Volume (10), Issue (9), 2017.
22. Shyamala Devi M, Sruthi A. N, Saranya Jothi C, “MRI Liver Tumor Classification Using Machine Learning Approach and Structure Analysis,” Research Journal of Pharmacy and Technology, Volume (11), Issue (2), 2018.
23. B.D. Venkatramana Reddy and T. Jayachandra Prasad, “Color-Texture Image Segmentation Algorithms based on Hypercomplex Gabor Analysis,” Research Journal of Pharmacy and Technology, Volume (2), Issue (2), 2011.
Received on 15.11.2018 Modified on 31.12.2018
Accepted on 21.01.2019 © RJPT All right reserved
Research J. Pharm. and Tech. 2019; 12(5):2109-2115.
DOI: 10.5958/0974-360X.2019.00350.0